# Pre-training Optimization
**MrT5 Large** (stanfordnlp)
MrT5 is an efficient byte-level language model built on ByT5 that reduces input sequence length by approximately 50% through dynamic token merging.
Large Language Model · Transformers · Multilingual

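A minimal loading sketch, assuming the checkpoint is published as `stanfordnlp/mrt5-large` and exposes its custom token-merging architecture through `trust_remote_code`; both assumptions should be checked against the model card.

```python
# Hedged sketch: loading a byte-level seq2seq checkpoint such as MrT5.
# The repo id and the trust_remote_code requirement are assumptions;
# MrT5's token-deletion/merging gates are custom layers, not core transformers.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "stanfordnlp/mrt5-large"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, trust_remote_code=True)

# Byte-level models tokenize UTF-8 bytes, so input sequences are long;
# the merging gates are what shrink them inside the encoder.
inputs = tokenizer("Translate to German: Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
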
**Llama 3 70B Special Tokens Adjusted** (astronomer)
A version of Meta-Llama-3-70B with adjusted special tokens, fixing the fine-tuning issues caused by untrained special-token embeddings in the original release.
Large Language Model · Transformers

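The general technique behind this kind of fix is to re-initialise the untrained special-token embedding rows before fine-tuning. A hedged sketch of that idea follows; the mean-initialisation strategy and the token list are assumptions about this class of fix, not a description of exactly what the published checkpoint does.

```python
# Hedged sketch: re-initialise untrained special-token embeddings so that
# fine-tuning with chat templates does not destabilise. Strategy and token
# list are assumptions, not the checkpoint's documented procedure.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-70B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

special_ids = tokenizer.convert_tokens_to_ids(
    ["<|eot_id|>", "<|start_header_id|>", "<|end_header_id|>"]
)
with torch.no_grad():
    in_emb = model.get_input_embeddings().weight
    out_emb = model.get_output_embeddings().weight
    mean_in, mean_out = in_emb.mean(dim=0), out_emb.mean(dim=0)
    for tid in special_ids:
        in_emb[tid] = mean_in    # replace near-zero, untrained rows
        out_emb[tid] = mean_out  # lm_head rows need the same treatment
```
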
**BERT MLM Medium** (aajrami)
A medium-sized BERT language model pre-trained with the masked language modeling (MLM) objective.
Large Language Model · Transformers

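Since the pre-training objective is standard MLM, the checkpoint can be exercised with a fill-mask pipeline. A minimal sketch, assuming the hub id `aajrami/bert-mlm-medium`:

```python
# Hedged sketch: scoring candidates for a masked position, which mirrors
# the MLM pre-training objective. The repo id is assumed from the listing.
from transformers import pipeline

fill = pipeline("fill-mask", model="aajrami/bert-mlm-medium")
masked = f"The capital of France is {fill.tokenizer.mask_token}."
for pred in fill(masked):
    print(pred["token_str"], round(pred["score"], 3))
```
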
**BERT FC Medium** (aajrami)
A medium-sized BERT language model that uses first-character prediction as its pre-training objective.
Large Language Model · Transformers

**Randeng Pegasus 523M Chinese** (IDEA-CCNL)
A Chinese version of PEGASUS-large specialized for text summarization, trained on the PEGASUS architecture with optimizations for Chinese tokenization.
Text Generation · Transformers · Chinese

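A minimal summarization sketch, assuming the checkpoint is `IDEA-CCNL/Randeng-Pegasus-523M-Chinese` and that its tokenizer loads via the standard classes; because the entry mentions a customised Chinese tokenizer, the model card's own loading instructions should take precedence if this fails.

```python
# Hedged sketch: Chinese abstractive summarization with a PEGASUS checkpoint.
# Repo id and default-tokenizer loading are assumptions; the model card
# describes a customised Chinese tokenizer, so prefer its instructions.
from transformers import AutoTokenizer, PegasusForConditionalGeneration

model_id = "IDEA-CCNL/Randeng-Pegasus-523M-Chinese"  # assumed repo id
tok = AutoTokenizer.from_pretrained(model_id)
model = PegasusForConditionalGeneration.from_pretrained(model_id)

text = "据报道，该公司今日发布了新一代大语言模型，推理速度较上一代提升一倍，并将于下月开放接口。"
inputs = tok(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```
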
**ReasonBERT TAPAS** (Anonymous)
Built on the tapas-base architecture and further pre-trained on table inputs to strengthen reasoning in question-answering tasks.
Large Language Model · Transformers

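Since the entry describes a pre-trained encoder rather than a fine-tuned QA head, a reasonable use is to encode a table–question pair and fine-tune on top of the representations. A hedged sketch, with the repo id and tokenizer availability both assumed:

```python
# Hedged sketch: encoding a table + question with a TAPAS-style checkpoint.
# The repo id is assumed from the listing; the checkpoint is treated as a
# plain encoder whose outputs feed a downstream QA fine-tune.
import pandas as pd
from transformers import TapasTokenizer, TapasModel

model_id = "Anonymous/ReasonBERT-TAPAS"  # assumed repo id
tok = TapasTokenizer.from_pretrained(model_id)
model = TapasModel.from_pretrained(model_id)

table = pd.DataFrame({"City": ["Paris", "Berlin"], "Population (M)": ["2.1", "3.6"]})
inputs = tok(table=table, queries=["Which city has the larger population?"], return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```
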